ASYMPTOTIC GENEALOGY OF A CRITICAL BRANCHING PROCESS
By Lea Popovic
University of California, Berkeley 

Consider a continuous-time binary branching process conditioned 
to have population size n at some time t, and with a chance p for 
recording each extinct individual in the process. Within the family 
tree of this process, we consider the smallest subtree containing the 
genealogy of the extant individuals together with the genealogy of the 
recorded extinct individuals. We introduce a novel representation of 
such subtrees in terms of a point-process, and provide asymptotic 
results on the distribution of this point-process as the number of 
extant individuals increases. We motivate the study within the scope 
of a coherent analysis for an a priori model for macroevolution. 


1. Introduction. The use of stochastic models in the theory of macroevolution
(origin and extinction of species) has been common practice for many
years now. Stochastic models have been used to recreate phylogenetic trees 
of extant taxa from molecular data, and to recreate the time series of the 
past number of taxa from the fossil record. However, only a few attempts 
have been made to make the two analyses consistent with each other. Instead 
of studying data-motivated models (which are scientifically more realistic for 
specific applications), the purpose of this paper is to study a purely random 
model that can accommodate such a coherent analysis. We study a mathematically
fundamental stochastic model which allows for inclusion of both
the extant and the fossil types of data in one analysis.

A significant interest in evolutionary biology is devoted to reconstructing 
phylogenies based on available data on the extant species (of molecular or 
other type). The assumption is that in the distant past there was a common 
ancestor from which the extant species evolved according to some (stochastic)
evolutionary model. One then tries to find the ancestral history (genealogy)
of the extant species which optimizes some “best fit” criterion. Using
shapes of such phylogenetic trees one then hopes to make some inference
on the diversification properties of the evolutionary process. We stress here
that the number of extant species is given, although the time from the origin
of the evolutionary process to the present often is inferred from the data
as well. On the other hand, inference of diversification rates based on fossil 
data mostly makes use of time series analyses of fossil counts. Once again 
one assumes an underlying (stochastic) evolutionary process, then tries to 
use fluctuations in the time series of fossil counts to make inferences on the 
diversification rates of the process. For the most part, however, only a small 
fraction of the species are retained within the fossil record, with variation in 
sampling rates over time. It is often hard to estimate precisely the proportion
of species retained as fossils, although its variability may be reasonably
captured by considering the sampling rate to be random as well. 

The motivation for this paper was to consider a stochastic process which 
would incorporate both sets of data within one evolutionary model, and 
to present results describing the genealogy of the fossil record as well. Our 
basic premises are the given information on the number of extant species, the 
amount of time from the origin of the process to the present and the chance 
of species to be retained in the fossil record. It is subsequently possible 
to randomize the amount of time from the origin to the present day, as 
well as the chance of being retained in the fossil record. Details of such 
randomization under a reasonable choice of priors can be found in [17]. 

The model we propose is the continuous-time critical branching process. 
The reasons for our choice are the following. If one is to consider a model in 
which extinctions and speciations are random without systematic tendencies 
for the number of species to increase or decrease, then for a branching process 
this translates into the criticality of the process (the average number of 
offspring of each individual is 1). Such a model corresponds to one general
view in evolutionary biology that (except for mass extinctions and their
aftermath) the overall number of species exhibits neither exponential growth
nor exponential decrease.

The fundamental critical branching processes previously employed in evolutionary
models have drawbacks that exclude their use in our proposed
study. The basic evolution model is the Yule process [20], the elementary
continuous-time pure birth process [the process starts with one individual,
each individual gives birth to offspring according to a Poisson(rate 1) process].
One can clearly not employ this model, as it a priori does not involve
the extinction of species, hence does not allow for inclusion of the fossil
record. The next candidate model, which includes the extinction of individuals,
is the basic neutral model used in population genetics. The Moran
model [7] is the process of uniformly random speciations and extinctions of
individuals in a population of a fixed size [the total number of individuals is
a fixed number, each individual lives for an Exponential(mean 1) lifetime, at
the end of which it is replaced by an offspring chosen uniformly at random
from the total population including itself]. One can consider this process
as having persisted from a distant past to the present, giving implicitly a
genealogical tree of the extant individuals. Asymptotically in the total population
size (with suitable rescaling) this genealogical process (backward in
time) is Kingman’s coalescent model. Although it is possible to make modifications
of this model to allow for nonconstant population size [11], this
unfortunately requires an a priori assumption on the evolution of the total
population size in time.

We are interested in considering a group of species that have some common
ancestor at their origin. This corresponds to the practice in evolutionary
biology of considering monophyletic groups. In this sense, the critical
continuous-time binary branching process, in which individuals live for an
Exponential(mean 1) time during which they produce offspring at
Poisson(rate 1) times, is the natural basic model for the given purpose. We want
to study the genealogical structure of the process conditioned on its population
size at a given time t. By genealogical structure we mean a particular
subtree of the branching process family tree. We consider all the extant
individuals at time t, and the subset of the extinct individuals each having
independent chance p of being sampled into the record. The genealogical
subtree is the smallest one containing all the common ancestors of the extant
individuals and all the sampled extinct individuals. We introduce a
point-process representation of this genealogical subtree, with a convenient
graphical interpretation, and derive its law. Our main result is the asymptotic
behavior of such point-processes and their connection to a conditioned
Brownian excursion.

The relationship between random trees and Brownian excursions has been 
much explored in the literature. We note only a small selection that is directly
relevant to the work in this paper. Neveu and Pitman [14, 15] and
Le Gall [12] noted the appearance of continuous-time critical branching processes
embedded in the structure of a Brownian excursion. Abraham [1] and
Le Gall [13] considered the construction of an infinite tree within a Brownian 
excursion, which is in some sense a limit of the trees from the work of Neveu 
and Pitman. The convergence of critical branching processes conditioned on 
total population size to a canonical tree within a Brownian excursion (the 
continuum random tree) was introduced by Aldous [3]. We state a connection 
of the asymptotic results in this paper with the above-mentioned results. 

Some aspects of the genealogy of critical Galton-Watson trees conditioned
on nonextinction have been studied by Durrett [6], without the use of random
trees. It has also been studied within the context of superprocesses
(see, e.g., [13]). We further note that, in the biological literature, models of
evolution have been made on each level of taxonomy (species, genera, etc.)
separately, while it is certainly desirable to ensure hierarchical consistency
between them. A natural way to extend our analysis to the next taxonomic
level is to superimpose on the branching process a random process of marks
distinguishing some species as originators of a new higher taxon. In collaboration
with Aldous, we have pursued this study in [4], as part of a larger
project on coherent and consistent stochastic models for macroevolution.

As a last remark, we note that, as implied by general convergence results 
on critical branching processes ([3] and many others), the same asymptotic 
genealogical process obtained here should invariably hold in general for any 
critical branching process with finite offspring variance. 

The paper is structured in two parts. In Section 2 we give a precise definition
of the genealogical point-process representing the common ancestry
of the extant individuals and provide its exact law and asymptotic behavior 
(Theorem 5). Then, in Section 3 we give the definition of the corresponding 
genealogical point-process that includes the sampled extinct individuals as 
well, and we provide its exact law and asymptotic behavior (Theorem 9). 

2. Genealogy of extant individuals. Let $\mathcal{T}$ be a continuous-time critical
branching process, with initial population size 1. In such a process each
individual has an Exponential(rate 1) lifetime, in the course of which it gives
birth to new individuals at Poisson(rate 1) times, with all the individuals
living and reproducing independently of each other. Let $\mathcal{T}_{t,n}$ be the process
$\mathcal{T}$ conditioned to have population size n at time t. We shall use the same
notation ($\mathcal{T}$ and $\mathcal{T}_{t,n}$) for the random trees with edge-lengths that are the
family trees of these processes.

We depict these family trees as rooted planar trees with the following 
conventions. Each individual is represented with a set of edges whose total 
length is equal to that individual’s lifetime. Each birth time of an offspring 
corresponds to a branch-point in the parent’s edge, with the total length of 
the parent’s edge until the branch-point equal to the parent’s age at this 
time. The new individual is then represented by the edge on the right, while 
the parent continues in the edge on the left. Such trees are identified by their 
shape and by the collection of the birth times and lifetimes of individuals. 
We shall label the vertices in the tree in a depth-first search manner. An
example of a random tree realization of $\mathcal{T}_{t,n}$ is shown in Figure 1(a).

Remark 1. The random tree T we defined is almost the same as the 
family tree of a continuous-time critical binary-branching Galton-Watson 
process. The difference between the two is only in the identities of the in¬ 
dividuals. If, in the Galton-Watson process, at each branching event with 
two offspring we were to impose the identification of the left offspring with 
its parent, the resulting random tree would be the same as the family tree 
of our branching process T. 
Fig. 1. (a) A realization of the tree $\mathcal{T}_{t,n}$ whose population at time t is n = 5; the leaves
are labeled in depth-first search manner. (b) The contour process $C_{\mathcal{T}_{t,n}}$ of the tree $\mathcal{T}_{t,n}$;
each local maximum of $C_{\mathcal{T}_{t,n}}$ corresponds to the height of a leaf of $\mathcal{T}_{t,n}$.


Let $C_{\mathcal{T}}$ be the contour process induced by the random tree $\mathcal{T}$. The contour
process of a rooted planar tree is a continuous function giving the distance
from the root of a unit-speed depth-first search of the tree. Such a process
starts at the root of the tree, traverses each edge of the tree once upward
and once downward following the depth-first search order of the vertices
and ends back at the root of the tree. The contour process consists of line
segments of slope +1 (the rises) and line segments of slope -1 (the falls).
The unit speed of the traversal ensures that the height levels in the process
are equivalent to distances from the root in the tree, in other words to the
times in the branching process. The contour process induced by the random
tree $\mathcal{T}_{t,n}$ depicted in Figure 1(a) is shown in Figure 1(b). For a formal
definition of planar trees with edge lengths, contour processes and their many 
useful properties one can consult the recent lecture notes of Pitman ([16], 
Section 6.1). 

Let the genealogy of extant individuals at time t be defined as the smallest
subtree of the family tree which contains all the edges representing the ancestry
of the extant individuals. The genealogy of extant individuals at t in
$\mathcal{T}_{t,n}$ is thus an n-leaf tree, which we denote by $\mathcal{G}(\mathcal{T}_{t,n})$. Figure 2(a) shows the
genealogical subtree of the tree from Figure 1(a). We now introduce a novel
point-process representation of this genealogical tree $\mathcal{G}(\mathcal{T}_{t,n})$. This gives
an object that is much simpler to analyze, and yields much clearer asymptotic
results, than working in the original space of trees with edge-lengths.











Fig. 2. (a) The genealogical tree $\mathcal{G}(\mathcal{T}_{t,n})$ of the extant individuals at time t. (b) The
point-process $\Pi_{t,n}$ representation of $\mathcal{G}(\mathcal{T}_{t,n})$ [the dotted lines show the simple reconstruction
of $\mathcal{G}(\mathcal{T}_{t,n})$ from its point-process].


Informally, think of forming this point-process by taking the heights of
the branching points of the genealogical tree $\mathcal{G}(\mathcal{T}_{t,n})$ in the order they have
as vertices in the tree. For convenience (in considering asymptotics with t
increasing) we keep track of the heights of the branching points in terms of
their distances from level t. The vertical coordinate of each branching point
is thus its distance below level t, while its horizontal coordinate is just its
index. The point-process representation of $\mathcal{G}(\mathcal{T}_{t,n})$ from Figure 2(a) is shown
in Figure 2(b). Formally, let $a_i$, $1 \le i \le n-1$, be the times (distance to the
root) of branch-points in the tree $\mathcal{G}(\mathcal{T}_{t,n})$, indexed in order induced from
the depth-first search of the vertices in $\mathcal{T}_{t,n}$, let $t_i = t - a_i$ be their distance
below level t and let $\ell_i = i$.

Definition 2. The genealogical point-process $\Pi_{t,n}$ is the random finite
set

(1) $\Pi_{t,n} = \{(\ell_i, t_i) : 1 \le i \le n-1,\ 0 < t_i < t\}.$

For practical purposes it is most useful to exploit the bijection between a
random tree and its contour process. We can obtain the point-process $\Pi_{t,n}$
equivalently from the contour process $C_{\mathcal{T}_{t,n}}$ as follows. The ith individual
extant at t corresponds to the pair $(U_i, D_i)$: of the ith up-crossing time $U_i$
of level t by the contour process $C_{\mathcal{T}_{t,n}}$ and of the ith down-crossing time $D_i$
of level t. A precise definition of $U_i$ and $D_i$ will be given by (3) and (4) in
the proof of Lemma 3. The branch-points $a_i$, $1 \le i \le n-1$, of $\mathcal{G}(\mathcal{T}_{t,n})$ then
correspond to the levels of lowest local minima of the excursions of $C_{\mathcal{T}_{t,n}}$
below level t, in other words $a_i = \inf\{C_{\mathcal{T}_{t,n}}(u) : D_i < u < U_{i+1}\}$.
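The crossing description above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the construction in the paper: the contour path is assumed to be given by the heights of its successive turning points (a hypothetical input format), no turning point is assumed to lie exactly at level t, and the function name `genealogical_points` is ours.

```python
def genealogical_points(heights, t):
    """Return the points (i, t_i) read off a contour path.

    heights: heights at successive turning points of the contour, starting
    and ending at 0 (hypothetical input format); consecutive turning points
    are joined by segments of slope +1 or -1.
    Assumes no turning point lies exactly at level t.
    """
    points = []
    upcrossings = 0
    running_min = None  # lowest turning point since the last down-crossing
    for prev, cur in zip(heights, heights[1:]):
        if prev < t < cur:            # up-crossing U_i of level t
            upcrossings += 1
            if running_min is not None:
                # minimum between D_i and U_{i+1} is the branch level a_i
                points.append((upcrossings - 1, t - running_min))
            running_min = None
        elif prev > t > cur:          # down-crossing D_i of level t
            running_min = cur
        elif running_min is not None:  # segment staying below level t
            running_min = min(running_min, cur)
    return points


# Three individuals are extant at t = 3, so there are two branch points.
print(genealogical_points([0, 5, 1, 4, 2, 6, 3.5, 4.5, 0], 3))
# prints [(1, 2), (2, 1)]
```

Since the local minima of a piecewise linear path sit at its turning points, tracking the turning points alone is enough to recover each $a_i$.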

We next use this observation together with the description of the law of
$C_{\mathcal{T}_{t,n}}$ to obtain the law of $\Pi_{t,n}$. We first recall the result of Neveu, Pitman
and Le Gall, regarding the law of the contour process $C_{\mathcal{T}}$ of an unconditioned
random tree $\mathcal{T}$ (one can consult either [12] or [14] for its proof).

Lemma 1. In the contour process $C_{\mathcal{T}}$ of a critical branching process $\mathcal{T}$
the sequence of rises and falls (up to the last fall) has the same distribution
as a sequence of independent Exponential(rate 1) variables stopped one step
before the sum of successive rises and falls becomes negative (the last fall is
then set to equal this sum).
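Lemma 1 translates directly into a sampler for the contour of the unconditioned tree. The sketch below is a minimal illustration of that recipe; the function name `sample_contour` and the turning-point output format are our own assumptions.

```python
import random


def sample_contour(rng):
    """Turning-point heights of the contour of an unconditioned tree.

    Alternates independent Exponential(rate 1) rises and falls, and stops
    at the first fall that would take the path below 0; that last fall is
    truncated so that the path ends exactly at 0.
    """
    heights = [0.0]
    level = 0.0
    while True:
        level += rng.expovariate(1.0)   # a rise
        heights.append(level)
        fall = rng.expovariate(1.0)     # the following fall
        if fall >= level:               # the running sum would turn negative
            heights.append(0.0)
            return heights
        level -= fall
        heights.append(level)


rng = random.Random(1)  # fixed seed, only for reproducibility
path = sample_contour(rng)
print(len(path), max(path))
```

The number of steps of such a path has a heavy tail (the process is critical), so large batches of samples can occasionally contain very long paths.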

The following corollary is an immediate consequence of Lemma 1 and the 
memoryless property of the exponential distribution. 

Corollary 2. For the contour process $C_{\mathcal{T}}$ the process $X_{\mathcal{T}} = (C_{\mathcal{T}}, \mathrm{slope}[C_{\mathcal{T}}])$
is a time-homogeneous strong Markov process on $\mathbb{R}^+ \times \{+1,-1\}$ stopped
when it first reaches $(0,-1)$.

The law of the genealogical point-process $\Pi_{t,n}$ can now easily be derived
using some standard excursion theory of Markov processes. Note that the
contour process of a whole class of binary branching processes can be shown
to be a time-homogeneous Markov process as well (see [9]). In the following
lemma we show that the distances of the n - 1 branching points below level
t are independent and identically distributed, with the same law as that of
the height of a random tree $\mathcal{T}$ conditioned on its height being less than t.

Lemma 3. For any fixed t > 0, the random set $\Pi_{t,n}$ is a simple point-
process on $\{1, \ldots, n-1\} \times (0,t)$ with intensity measure

(2) $\nu_{t,n}(\{i\} \times d\tau) = \frac{d\tau}{(1+\tau)^2}\,\frac{1+t}{t}.$

In other words, $t_i$, $1 \le i \le n-1$, are i.i.d. variables on $(0,t)$ with the law (2).
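Because the depths are i.i.d. with an explicit density, the point-process can be sampled by inverting the distribution function $F(\tau) = \frac{1+t}{t}\,\frac{\tau}{1+\tau}$ on $(0,t)$, which gives $\tau = ut/(1+t-ut)$ for a uniform $u$. The following sketch does this; the function names are hypothetical, and `mrca_depth` relies on the observation (from the contour description) that two extant individuals i < j coalesce at the largest depth among the points indexed i, ..., j - 1.

```python
import random


def depth_quantile(u, t):
    """Inverse of F(tau) = ((1 + t) / t) * tau / (1 + tau) on (0, t)."""
    return u * t / (1.0 + t - u * t)


def sample_point_process(t, n, rng):
    """Draw the n - 1 points (i, t_i) with i.i.d. depths of law (2)."""
    return [(i, depth_quantile(rng.random(), t)) for i in range(1, n)]


def mrca_depth(points, i, j):
    """Depth below level t at which extant individuals i < j coalesce."""
    return max(depth for index, depth in points if i <= index < j)


rng = random.Random(0)  # fixed seed, only for reproducibility
points = sample_point_process(t=2.0, n=5, rng=rng)
print(points)
print(mrca_depth(points, 1, 5))  # common ancestor of all five individuals
```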

Proof. In short the proof relies on the following. The contour process
$C_{\mathcal{T}}$ of an unconditioned tree $\mathcal{T}$ is, by the previous corollary, a Markov process
considered until a certain stopping time. Hence, its excursions below some
level t are independent and identically distributed. Conditioning of the tree
$\mathcal{T}_{t,n}$ translates simply, in terms of its contour process, into conditioning this
Markov process to have exactly n - 1 excursions below t until this stopping
time. Further, for the law of these excursions it will follow, by the sign
invariance of the law of $C_{\mathcal{T}}$, that their law is the same as that of a copy of
$C_{\mathcal{T}}$ conditioned to have a height less than t.

Consider the Markov process $X_{\mathcal{T}} = (C_{\mathcal{T}}, \mathrm{slope}[C_{\mathcal{T}}])$ until the first hitting
time $U_{(0,-1)} = \inf\{u > 0 : X_{\mathcal{T}}(u) = (0,-1)\}$, and consider its excursions from
the point $(t,+1)$ using the distribution of $C_{\mathcal{T}}$ given by Lemma 1. For $i \ge 1$
let $U_i$ be the times of the up-crossings of level t by $C_{\mathcal{T}}$,

(3) $U_0 = 0, \quad U_i = \inf\{u > U_{i-1} : X_{\mathcal{T}}(u) = (t,+1)\}, \quad i \ge 1.$

Clearly $P_{(t,+1)}[\inf\{u > 0 : X_{\mathcal{T}}(u) = (t,+1)\} > 0] = 1$; hence the set of all visits
to $(t,+1)$ at times $\{U_i, i \ge 1\}$ is discrete. The excursions of $X_{\mathcal{T}}$ from level
t are, for $i \ge 1$,

$e_i(u) = \begin{cases} X_{\mathcal{T}}(U_i + u), & \text{for } u \in [0, U_{i+1} - U_i), \\ (0,+1), & \text{else.} \end{cases}$

The number of visits in an interval $[0,u]$ is

$\ell(0) = 0, \quad \ell(u) = \sup\{i > 0 : u > U_i\}, \quad u > 0,$

and the total number prior to $U_{(0,-1)}$ is $L = \sup\{i \ge 0 : U_{(0,-1)} > U_i\} =
\ell(U_{(0,-1)})$. If $\mathbf{n}$ is the $P_{(t,+1)}$-law of $e_i$, and if $\mathcal{E}^{<t}$ is the set of excursions
from $(t,+1)$ that return to $(t,+1)$ without reaching $(0,-1)$, and $\mathcal{E}^{>t}$ the set
of all others, then it is clear that (e.g., [19], 2, Section VI.50) the following
hold:

(a) $P_{(t,+1)}[L \ge i] = [\mathbf{n}(\mathcal{E}^{<t})]^{i-1}$, $i \ge 1$, and $e_1, e_2, \ldots$ are independent;

(b) given that $L \ge i$, the law of $e_1, e_2, \ldots, e_{i-1}$ is $\mathbf{n}(\cdot \cap \mathcal{E}^{<t})/\mathbf{n}(\mathcal{E}^{<t})$;

(c) given that $L = i$, the law of $e_i$ is $\mathbf{n}(\cdot \cap \mathcal{E}^{>t})/\mathbf{n}(\mathcal{E}^{>t})$.

This makes $\{(\ell(U_i), e_i), 1 \le i \le L-1\}$ a simple point-process [note that
$\ell(U_i) = i$, and $\ell(\infty) = L$], with the number of points having a Geometric($\mathbf{n}(\mathcal{E}^{>t})$)
law, and with each $e_i$ having the law $\mathbf{n}(\cdot \cap \mathcal{E}^{<t})/\mathbf{n}(\mathcal{E}^{<t})$.

This observation is particularly convenient for analyzing the law of $C_{\mathcal{T}_{t,n}}$.
Since $C_{\mathcal{T}_{t,n}}$ is just $C_{\mathcal{T}}$ conditioned on $L = n$, the n - 1 excursions of $C_{\mathcal{T}_{t,n}}$ below
t are independent identically distributed with the law $\mathbf{n}(\cdot \cap \mathcal{E}^{<t})/\mathbf{n}(\mathcal{E}^{<t})$. We
next derive the law of their depth $a_i$ measured as distance from level t by
$t_i = t - a_i$.

For each up-crossing time $U_i$ of level t, we have a down-crossing time

(4) $D_i = \inf\{u > U_i : X_{\mathcal{T}}(u) = (t,-1)\}, \quad i \ge 1.$

For the values of $a_i$, $i \ge 1$, we are only interested in the part of the excursions
from $(t,+1)$ below level t,

$e_i^{<t}(u) = e_i(D_i + u)$, $u \in [0, U_{i+1} - D_i)$, and $e_i^{<t}(u) = (0,+1)$ else.
We note that the shift and reflection invariance of the transition function of
$C_{\mathcal{T}}$, as well as its strong Markov property, applied to the law $\mathbf{n}$ for $e_i^{<t}$ imply
that the law of $e_i^{+} = t - e_i^{<t}$ is the same as the law of $X_{\mathcal{T}}$ conditioned to return
to $(0,-1)$ before reaching $(t,+1)$. Consequently the law of $t - \inf(e_i^{<t}) =
\sup(e_i^{+})$ is the same law as that of $\sup(C_{\mathcal{T}})$ conditioned to be less than t.

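To explicitly express the law of $\sup(C_{\mathcal{T}})$ we now recall classical results for
the branching process $\mathcal{T}$ (e.g., [8], Section XVII.10.11), by which the law of
the population size $N(t)$ of $\mathcal{T}$ at time t is given by

(5) $P[N(t) = 0] = \frac{t}{1+t}, \qquad P[N(t) = k] = \frac{t^{k-1}}{(1+t)^{k+1}} \quad \text{for } k \ge 1.$

Hence

(6) $P[\sup(C_{\mathcal{T}}) > t] = P[N(t) > 0] = \frac{1}{1+t} \quad \text{for } t \ge 0.$

Now for $C_{\mathcal{T}_{t,n}}$ and for each $1 \le i \le n-1$ we have that $a_i = \inf(e_i^{<t})$, and the
$e_i^{<t}$ are independent with $e_i^{<t} \sim \mathbf{n}(\cdot \cap \mathcal{E}^{<t})/\mathbf{n}(\mathcal{E}^{<t})$; hence each $t_i = t - a_i$ has
the law

(7) $P[t_i \in d\tau] = P[\sup(C_{\mathcal{T}}) \in d\tau \mid \sup(C_{\mathcal{T}}) < t] = \frac{d\tau}{(1+\tau)^2}\,\frac{1+t}{t} \quad \text{for } 0 < \tau < t.$

□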
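The formulas used in this proof are easy to sanity-check numerically. The sketch below (function names are ours, and the series is truncated at an arbitrary cutoff) verifies that the population law $P[N(t)=0] = t/(1+t)$, $P[N(t)=k] = t^{k-1}/(1+t)^{k+1}$ sums to 1 and has mean 1, as criticality requires, that the survival probability is $1/(1+t)$, and that the density $\frac{1}{(1+\tau)^2}\frac{1+t}{t}$ integrates to 1 over $(0,t)$.

```python
def population_law(k, t):
    """P[N(t) = k] for the critical binary branching process."""
    if k == 0:
        return t / (1.0 + t)
    return t ** (k - 1) / (1.0 + t) ** (k + 1)


def depth_cdf(tau, t):
    """Integral of the depth density over (0, tau), for 0 < tau <= t."""
    return (1.0 + t) / t * tau / (1.0 + tau)


t = 2.0
terms = [population_law(k, t) for k in range(400)]  # truncated series
total_mass = sum(terms)
mean_size = sum(k * p for k, p in enumerate(terms))
survival = 1.0 - population_law(0, t)

print(total_mass, mean_size, survival, depth_cdf(t, t))
```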


Asymptotics for $\Pi_{t,n}$ could now be established with a routine calculation.
Instead of considering this result in isolation, it is far more natural to view
it as part of the larger picture connecting critical branching processes and
Brownian excursions. Let us recall the asymptotic results for critical Galton-
Watson processes conditioned on a “large” total population size. A result of
Aldous ([3], Theorem 23) says that its contour process (when appropriately
rescaled) converges, as the total population size increases, to a Brownian
excursion (doubled in height) conditioned to be of length 1. Note that, if
$N_{\mathrm{tot}}$ is the total population size of a critical Galton-Watson process, and
$N(t_n)$ its population size at some given time $t_n$, then the events $\{N_{\mathrm{tot}} = n\}$
and $\{N(t_n) = n \mid N(t_n) > 0\}$ are both events of “small” probabilities. The
first has asymptotic chance $c\,n^{-3/2}$ as $n \to \infty$, and for $\{t_n\}_{n \ge 1}$ such that
$t_n/n \to t$ as $n \to \infty$ the second has asymptotic chance $c(t)\,n^{-1}$ [3]. While
the total population size $N_{\mathrm{tot}}$ corresponds to the total length of the contour
process, the population size $N(t_n)$ at a particular time $t_n$ corresponds to
the occupation time of the contour process at level $t_n$. Hence, it is natural
to expect that the contour process of a critical Galton-Watson process conditioned
on a “large” population at time $t_n$ (when appropriately rescaled)
converges, when $t_n/n \to t$ as $n \to \infty$, to a Brownian excursion conditioned
to have local time 1 at level t.
We will show the following. Consider a Brownian excursion conditioned
to have local time 1 at level t, as a “contour process” of an infinite tree (in
the sense of the bijection between continuous functions and trees established
in [3]). Consider defining a “genealogical” point-process from this Brownian
excursion, using the depths of its excursions below level t, in the same manner
as used in defining $\Pi_{t,n}$ from the contour process $C_{\mathcal{T}_{t,n}}$, except that the
excursions are now indexed by the amount of local time at level t at their
beginning. The state-space of such a point-process can be simply described,
and we show that it has quite a simple law as well. It is then easy to show
that this point-process is precisely the asymptotic process of appropriately
rescaled processes $\Pi_{t_n,n}$ as $n \to \infty$.

We construct a point-process from a Brownian excursion conditioned to
have local time 1 at level t, in the same manner in which $\Pi_{t,n}$ was constructed
from the contour process $C_{\mathcal{T}_{t,n}}$. Let $B(u)$, $u \ge 0$, be a Brownian excursion.
For a fixed t > 0, let $\ell_t(u)$, $u \ge 0$, be its local time at level t up to time
u with the normalization of local time as one-half the occupation density
relative to Lebesgue measure (the normalization choice is analogous to the
upcrossings-only count for the contour process $C_{\mathcal{T}}$). Let $i_t(\ell)$, $\ell \ge 0$, be the
inverse process of $\ell_t$, in other words $i_t(\ell) = \inf\{u > 0 : \ell_t(u) > \ell\}$. Let $B_{t,1}(u)$,
$u \ge 0$, then be the excursion B conditioned to have total local time $\ell_t$ equal
to 1, where $\ell_t = \ell_t(\infty)$ is the total local time at t. Consider excursions $e_\ell^{<t}$ of
$B_{t,1}$ below level t indexed by the amount of local time $\ell$ at the time $i_t(\ell^-)$ of
their beginning. For each such excursion let $a_\ell$ be its infimum, and let $t_\ell$ be
the depth of the excursion measured from level t, $t_\ell = t - a_\ell$. Itô’s excursion
theory then ensures that the process $\{(\ell, t_\ell) : i_t(\ell^-) \ne i_t(\ell)\}$ is well defined.

Definition 3. The continuum genealogical point-process $\pi_{t,1}$ is the random
countably infinite set

(8) $\pi_{t,1} = \{(\ell, t_\ell) : i_t(\ell^-) \ne i_t(\ell)\}.$

Remark 4. The name of the process will be justified by establishing it 
as the limit of genealogical point-processes. 

For the state-space of the continuum genealogical process we introduce
the notion of a nice point-process ([3], Section 2.8). A nice point-process on
$[0,1] \times (0,\infty)$ is a countably infinite set of points such that the following
hold:

1. for any $\delta > 0$, $[0,1] \times [\delta,\infty)$ contains only finitely many points;

2. for any $0 \le x < y \le 1$, $\delta > 0$, $[x,y] \times (0,\delta)$ contains at least one point.

We next show that the state-space for $\pi_{t,1}$ is the set of nice point-processes,
and we establish the law of this process using standard results of Lévy, Itô
and Williams on excursion theory.
Lemma 4. The random set $\pi_{t,1}$ is a Poisson point-process on $[0,1] \times (0,t)$
with intensity measure

(9) $\nu(d\ell \times d\tau) = d\ell\,\frac{d\tau}{\tau^2}.$

In particular, the random set $\pi_{t,1}$ is a.s. a nice point-process.

Proof. The crux of the proof lies in the following observations. An
unconditioned Brownian excursion B observed from the first time it reaches
level t is just t minus a standard Brownian motion observed until the first time it
reaches t. The excursions of B below level t are thus the positive excursions
of the Brownian motion. By a standard result, the process of excursions of
Brownian motion from 0, indexed by the amount of local time at 0 at the
time of their beginning, is a Poisson point-process with intensity measure
$d\ell \times \mathbf{n}$, where $\mathbf{n}$ is Itô’s excursion measure. One can show that the condition
on B to have local time 1 at level t is equivalent to the condition that the
shifted Brownian motion has all its excursions until local time 1 of height
lower than t and has one excursion at local time 1 higher than t. This then,
by the independence properties of Poisson processes, allows for a simple
description of the point-process of the depths of excursions below t of $B_{t,1}$
as a Poisson process itself, except restricted to the set $[0,1] \times (0,t)$.

Consider the path of an (unconditioned) Brownian excursion B after the
first hitting time of t, $U_t = \inf\{u > 0 : B(u) = t\}$, shifted and reflected about
the u-axis

(10) $\beta(u) = t - B(U_t + u) \quad \text{for } u \ge 0.$

Let $\ell_0^\beta(u)$, $u \ge 0$, be the local time of $\beta$ at level 0 up to time u, and
let $i_0^\beta(\ell)$, $\ell \ge 0$, be the inverse process of this local time, in other words
$i_0^\beta(\ell) = \inf\{u > 0 : \ell_0^\beta(u) > \ell\}$. Then the process $\beta(u)$, $u \ge 0$, is a standard
Brownian motion stopped at the first hitting time of t, $U_t^\beta = \inf\{u > 0 : \beta(u) =
t\}$.

Next, the excursions of $\beta$ from 0 are (with a change of sign) precisely the
excursions of B from t, and the local time process $\ell_0^\beta$ of $\beta$ is equivalent to
the local time process $\ell_t$ of B. We are only interested in the excursions of
B below t, which are the positive excursions of $\beta$ indexed by $\ell$ such that
$i_0^\beta(\ell^-) \ne i_0^\beta(\ell)$ and $\sup\{\beta(u) : i_0^\beta(\ell^-) < u < i_0^\beta(\ell)\} > 0$,

$e_\ell^{+}(u) = \beta(i_0^\beta(\ell^-) + u)$, $u \in [0, i_0^\beta(\ell) - i_0^\beta(\ell^-))$, and $e_\ell^{+}(u) = 0$ else.

Note that the infimum of an excursion of B below t is thus
simply $\inf(e_\ell^{<t}) = t - \sup(e_\ell^{+})$.

Standard results of Itô’s excursion theory (e.g., [19], 2, Section VI.47) imply
that for a standard Brownian motion $\beta$ the random set $\{(\ell, \sup(e_\ell^{+})) : i_0^\beta(\ell^-) \ne
i_0^\beta(\ell)\}$ is a Poisson point-process on $\mathbb{R}^+ \times \mathbb{R}^+$ with intensity measure $d\ell\,d\tau/\tau^2$
(recall our choice for the normalization of local time).

Now let $L = \inf\{\ell \ge 0 : \sup(e_\ell^{+}) \ge t\}$. Then stopping $\ell_0^\beta$ at the hitting time
L is equivalent to stopping $\beta$ at its hitting time $U_t^\beta$. Let $\pi_t$ be a random set
defined from the unconditioned Brownian excursion B, in the same manner
in which we defined $\pi_{t,1}$ from a conditioned Brownian excursion $B_{t,1}$. Then,
using the relationship (10) of B and $\beta$, we observe that $\pi_t$ is equivalent to a
restriction of $\{(\ell, \sup(e_\ell^{+})) : i_0^\beta(\ell^-) \ne i_0^\beta(\ell)\}$ on the random set $[0,L] \times (0,t)$.
The Poisson point-process description of $\{(\ell, \sup(e_\ell^{+})) : i_0^\beta(\ell^-) \ne i_0^\beta(\ell)\}$ now
implies that $\pi_t$ is a Poisson point-process on $\mathbb{R}^+ \times \mathbb{R}^+$ with intensity measure
$d\ell\,d\tau/\tau^2$ restricted to the random set $[0,L] \times (0,t)$.

Next, note that the condition $\{\ell_t = 1\}$ for B is equivalent to the condition
$\{\ell_0^\beta(U_t^\beta) = 1\}$ for $\beta$, which is further equivalent to the condition $\{L = 1\}$ for
$\pi_t$. We have thus established that $\pi_{t,1} = \pi_t \mid \{L = 1\}$.

Further, the condition $\{L = 1\}$ on $\pi_t$ is equivalent to the condition that
$\pi_t$ has no points in $[0,1) \times [t,\infty)$ and has a point in $\{1\} \times [t,\infty)$. However,
since $\pi_t$ is Poisson, independence of Poisson random measures on disjoint
sets implies that conditioning $\pi_t$ on $\{L = 1\}$ will not alter its law on the
set $[0,1] \times (0,t)$. Since $\pi_{t,1}$ is supported precisely on $[0,1] \times (0,t)$,
the above results together imply that $\pi_{t,1}$ is a Poisson point-process on
$[0,1] \times (0,t)$ with intensity measure $d\ell\,d\tau/\tau^2$.

It is now easy to see from the intensity measure of $\pi_{t,1}$ that its realizations
are a.s. nice point-processes, namely:

(a) for any $\delta > 0$, $\iint_{[0,1] \times [\delta,\infty)} d\ell\,d\tau/\tau^2 = 1/\delta < \infty$;

(b) for any $0 \le x < y \le 1$, and $\delta > 0$, $\iint_{[x,y] \times (0,\delta)} d\ell\,d\tau/\tau^2 = (y-x) \cdot \infty$.

Also, since $\pi_{t,1}$ is Poisson, finiteness of its intensity measure on $[0,1] \times [\delta,\infty)$
implies that it has a.s. only finitely many points in the set $[0,1] \times [\delta,\infty)$,
while infiniteness of its intensity measure on $[x,y] \times (0,\delta)$ implies that it has
a.s. at least one point on the set $[x,y] \times (0,\delta)$. □
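The finiteness of the intensity above any level $\delta$ means that the points of the continuum process with depth at least $\delta$ can be simulated exactly. A minimal sketch, assuming the intensity $d\ell\,d\tau/\tau^2$ on $[0,1] \times (0,t)$: the number of such points is Poisson with mean $1/\delta - 1/t$ (the mass of $[0,1] \times [\delta,t)$), their first coordinates are uniform on $[0,1]$, and their depths have density proportional to $\tau^{-2}$ on $[\delta,t)$. All function names are hypothetical.

```python
import random


def poisson(mean, rng):
    """Poisson(mean) draw: count unit-rate arrivals falling in [0, mean)."""
    count, arrival = 0, rng.expovariate(1.0)
    while arrival < mean:
        count += 1
        arrival += rng.expovariate(1.0)
    return count


def height_quantile(u, delta, t):
    """Inverse CDF of the density proportional to 1/tau^2 on [delta, t)."""
    mass = 1.0 / delta - 1.0 / t
    return 1.0 / (1.0 / delta - u * mass)


def sample_continuum_points(t, delta, rng):
    """Points (l, tau) of the process with tau >= delta, sorted by l."""
    mass = 1.0 / delta - 1.0 / t  # intensity mass of [0, 1] x [delta, t)
    count = poisson(mass, rng)
    return sorted((rng.random(), height_quantile(rng.random(), delta, t))
                  for _ in range(count))


rng = random.Random(3)  # fixed seed, only for reproducibility
for point in sample_continuum_points(t=2.0, delta=0.1, rng=rng):
    print(point)
```

Lowering `delta` reveals more and more shallow points, in line with property (b): every strip $[x,y] \times (0,\delta)$ contains infinitely much intensity.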

Having thus obtained the description of the continuum genealogical point-
process induced by a conditioned Brownian excursion, it is now a simple task
to confirm that it indeed arises as the limit of genealogical processes. The
right rescaling for $\mathcal{T}_{t,n}$ is to speed up the time by n and to assign mass
$n^{-1}$ to each extant individual, which implies the appropriate rescaling of
each coordinate of $\Pi_{t,n}$ by $n^{-1}$. We hence define the rescaled genealogical
point-process as

(11) $n^{-1}\Pi_{t,n} = \{(n^{-1}\ell_i, n^{-1}t_i) : (\ell_i, t_i) \in \Pi_{t,n}\}$

and establish its asymptotic behavior as $n \to \infty$.
Theorem 5. For any $\{t_n > 0\}_{n \ge 1}$ such that $t_n/n \to t$ we have

$n^{-1}\Pi_{t_n,n} \overset{d}{\Longrightarrow} \pi_{t,1}$ as $n \to \infty$.

Remark 5. The notation $\overset{d}{\Longrightarrow}$ is used to mean weak convergence of
processes.


Proof of Theorem 5. The proof of the theorem is a just consequence 
of the fact that weak convergence of Poisson point-processes follows from 
the weak convergence of their intensity measures. 

By Lemma 3 and the rescaling (11) we have that n~ l F[t n , n is a simple 
point-process on {1/n,..., 1 — 1/n} x (0,t n /n) with intensity measure 


( 12 ) 


1 ndT 1 + tr 

2^{</»}(*) (1 + „ r ) 2 tn 


n 


i =1 


If $\{t_n\}_{n \ge 1}$ is such that $t_n/n \to t$ as $n \to \infty$, then it is clear that the
support set of the process $n^{-1}\Pi_{t_n,n}$ converges to $[0,1] \times (0,t)$, the support
set of the process $\pi_{t,1}$. It is also clear that the intensity measure (12)
converges to $d\ell\, d\tau/\tau^2$, which, by Lemma 4, is the intensity measure of $\pi_{t,1}$.
For simple point-processes this is sufficient (e.g., [5], Section 12.3) to ensure
weak convergence of the processes $n^{-1}\Pi_{t_n,n}$ to a Poisson point-process on
$[0,1] \times (0,t)$ with intensity measure $d\ell\, d\tau/\tau^2$. By Lemma 4, we thus have
that $n^{-1}\Pi_{t_n,n} \Rightarrow \pi_{t,1}$ as $n \to \infty$. □
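
The convergence of intensity measures used in this proof can be checked numerically. The following is a minimal Python sketch, not part of the original argument. It assumes the rescaled intensity has the form $\sum_{i=1}^{n-1} \delta_{\{i/n\}}(\ell)\, n\, d\tau/(1+n\tau)^2 \cdot (1+t_n)/t_n$ with the illustrative choice $t_n = nt$, and it compares the mass of a rectangle $[a,b] \times (c,d)$, with $d \le t$, against its mass under $d\ell\, d\tau/\tau^2$. The function names `rescaled_mass` and `limit_mass` are hypothetical.

```python
import math

def rescaled_mass(n, t, a, b, c, d):
    """Mass of [a, b] x (c, d) under the assumed rescaled intensity, with t_n = n * t."""
    t_n = n * t
    # Count the atoms i/n, 1 <= i <= n-1, that fall in [a, b]
    # (a small tolerance guards against floating-point rounding of a*n and b*n).
    lo = max(1, math.ceil(a * n - 1e-9))
    hi = min(n - 1, math.floor(b * n + 1e-9))
    atoms = max(0, hi - lo + 1)
    # Closed form of the integral of n dtau / (1 + n tau)^2 over (c, d).
    tau_mass = 1.0 / (1.0 + n * c) - 1.0 / (1.0 + n * d)
    return atoms * tau_mass * (1.0 + t_n) / t_n

def limit_mass(a, b, c, d):
    """Mass of [a, b] x (c, d) under the limiting intensity dl dtau / tau^2."""
    return (b - a) * (1.0 / c - 1.0 / d)
```

Under these assumptions, evaluating `rescaled_mass(n, 2.0, 0.2, 0.7, 0.5, 1.5)` for increasing `n` should approach `limit_mass(0.2, 0.7, 0.5, 1.5)`.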


3. Genealogy of sampled extinct individuals. We now want to extend 
the analysis of the ancestry of extant individuals to include some proportion 
of the extinct individuals as well. Suppose that each individual in the past 
has independently had a chance p of appearing in the historical record. We 
indicate such sampling of extinct individuals by putting a star mark on
the leaf of $\mathcal{T}_{t,n}$ corresponding to the recorded individual. An example of
a realization of such p-sampling is shown in Figure 3(a), and the induced 
sampling in the contour process is shown in Figure 3(b). 

The goal is to combine the information on the sampled extinct individuals 
with our analysis of the ancestry of the extant ones. To do so we extend 
our earlier notions of the genealogy of the extant individuals and of the 
genealogical point-process. 

Let the p-sampled history of extant individuals at time t be defined as the 
smallest subtree of the family tree which contains all the edges represent¬ 
ing the ancestry both of the extant individuals and of all of the p-sampled 
extinct individuals. We denote the p-sampled history of extant individuals
at $t$ in $\mathcal{T}_{t,n}$ by $\mathcal{G}^p(\mathcal{T}_{t,n})$. Note that by definition $\mathcal{G}^p(\mathcal{T}_{t,n})$ contains the
genealogy $\mathcal{G}(\mathcal{T}_{t,n})$ (which would correspond to a 0-sampled history). It is in
fact convenient to think of $\mathcal{G}^p(\mathcal{T}_{t,n})$ as consisting of the “main genealogical
tree” $\mathcal{G}(\mathcal{T}_{t,n})$ and a collection of “p-sampled subtrees” attached to this
main tree, linking with additional branches the ancestry of p-sampled extinct
individuals. Figure 4(a) shows the p-genealogical subtree of the tree from
Figure 3(a). We next extend the notion of the genealogical point-process to
represent this enriched p-sampled genealogy. We construct a point-process
representation of $\mathcal{G}^p(\mathcal{T}_{t,n})$ so that it contains $\Pi_{t,n}$ as its “main points.”

Informally, think of extending the point-process $\Pi_{t,n}$ [representing $\mathcal{G}(\mathcal{T}_{t,n})$]
by adding sets representing the p-sampled subtrees as follows. At each
branch-point of the main tree there is a set of p-sampled subtrees attached
to the edges of the main tree on the left of this branching point, and a
set of p-sampled subtrees attached on the right of this branching point [see
Figure 4(a)]. We associate with each branch-point at height $a_i$ a left set $\mathcal{L}_i$
and a right set $\mathcal{R}_i$, which shall represent these sets of subtrees. Each such
$\mathcal{L}_i$ and $\mathcal{R}_i$ needs to contain the following information: the heights $a_{i,L}(j)$
and $a_{i,R}(j)$ at which the p-sampled subtrees get attached to the edges of
the main tree [as before we shall keep track of these heights as distances
from level $t$ in terms of $t_{i,L}(j) = t - a_{i,L}(j)$ and $t_{i,R}(j) = t - a_{i,R}(j)$]; and the
shape of the subtrees $\Upsilon_{i,L}(j)$ and $\Upsilon_{i,R}(j)$ themselves (the indexing $j \ge 0$ on
the subtrees is induced by a depth-first search forward to the branch-point
at $a_i$ for the left sets and a depth-first search backward to the branch-point
at $a_i$ for the right sets). The point-process representing the p-sampled
genealogical tree from Figure 4(a) is shown in Figure 4(b). To describe the
law of the p-subtrees it will also be convenient to keep track of the heights
$h_{i,L}(j)$ and $h_{i,R}(j)$ of the subtrees $\Upsilon_{i,L}(j)$ and $\Upsilon_{i,R}(j)$.

Fig. 3. (a) The tree $\mathcal{T}_{t,n}$ with p-sampling on its individuals (the sampled individuals
are represented by *'s). (b) The contour process of this tree with the sampling on the
corresponding local maxima.

Formally, we define the point-process of $\mathcal{G}^p(\mathcal{T}_{t,n})$ from the contour
process $C_{\mathcal{T}_{t,n}}$. The p-sampling on the tree is represented by the sampling of the
local maxima of $C_{\mathcal{T}_{t,n}}$. From the definition of $\Pi_{t,n}$, we have the heights of the
branch-points of $\mathcal{G}(\mathcal{T}_{t,n})$ to be $a_i = \inf\{C_{\mathcal{T}_{t,n}}(u) : D_i < u < U_{i+1}\}$, occurring
in the contour process $C_{\mathcal{T}_{t,n}}$ at times $B_i = \arg\min\{C_{\mathcal{T}_{t,n}}(u) : u \in (D_i, U_{i+1})\}$.
The set $\mathcal{L}_i$, representing the set of p-subtrees attaching to the edges of $\mathcal{G}(\mathcal{T}_{t,n})$
on the left of the branch-point $a_i$, is defined from the part of the excursion of
$C_{\mathcal{T}_{t,n}}$ below $t$ before time $B_i$. In other words if, for
$X_{\mathcal{T}_{t,n}} = (C_{\mathcal{T}_{t,n}}, \mathrm{slope}[C_{\mathcal{T}_{t,n}}])$, we define

        $e^{<t}_{i,L}(u) = X_{\mathcal{T}_{t,n}}(D_i + u)$,   $u \in [0, B_i - D_i)$,

then $\mathcal{L}_i$ is completely defined by $e^{<t}_{i,L}$. Analogously $\mathcal{R}_i$ is defined from the
part of the excursion of $C_{\mathcal{T}_{t,n}}$ below $t$ after time $B_i$; in other words if we
define

        $e^{<t}_{i,R}(u) = X_{\mathcal{T}_{t,n}}(U_{i+1} - u)$,   $u \in [0, U_{i+1} - B_i)$,

then it is completely defined by $e^{<t}_{i,R}$ (the subscripts $L$ and $R$ reflect whether
the entities are involved in defining $\mathcal{L}_i$ or $\mathcal{R}_i$). Note that $e^{<t}_{i,L}$ runs forward
up to time $B_i$, whereas $e^{<t}_{i,R}$ runs backward. On the extreme ends, we have
the set of p-subtrees on the left of the first branching point defined by the
part of $C_{\mathcal{T}_{t,n}}$ prior to the first up-crossing time $U_1$,
$e^{<t}_{0,R}(u) = X_{\mathcal{T}_{t,n}}(U_1 - u)$, $u \in [0, U_1)$. Analogously, the set of p-subtrees on
the right of the last branching point is defined by the part of $C_{\mathcal{T}_{t,n}}$ after the
last down-crossing time $D_n$, $e^{<t}_{n,L}(u) = X_{\mathcal{T}_{t,n}}(D_n + u)$, $u \in [0, U_{(0,-1)} - D_n)$,
where $U_{(0,-1)} = \inf\{u > 0 : X_{\mathcal{T}_{t,n}}(u) = (0,-1)\}$.

Fig. 4. (a) The p-sampled tree $\mathcal{G}^p(\mathcal{T}_{t,n})$; the “main tree” (in bold) has the “p-sampled
subtrees” attached to it. (b) The point-process representation $\Xi^p_{t,n}$ of $\mathcal{G}^p(\mathcal{T}_{t,n})$; each of
the “main points” (large dots) has an associated left set and a right set representing the
p-sampled subtrees attaching to the left and right of that branch-point.

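To define the sets $\mathcal{L}_i$ and $\mathcal{R}_i$ we also need to define the processes

        $\varsigma_{i,L}(u) = \inf_{0 \le v \le u} e^{<t}_{i,L}(v)$,   $u \in [0, B_i - D_i)$,

        $\varsigma_{i,R}(u) = \inf_{0 \le v \le u} e^{<t}_{i,R}(v)$,   $u \in [0, U_{i+1} - B_i)$.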

The bijection between the tree $\mathcal{T}_{t,n}$ and its contour process $C_{\mathcal{T}_{t,n}}$ implies
that the heights at which the p-subtrees are attached to the edges of the
main tree are precisely the levels of constancy of the processes $\varsigma_{i,L}$ and $\varsigma_{i,R}$.
Furthermore, the p-subtrees themselves have as their contour processes the
excursions of $e^{<t}_{i,L} - \varsigma_{i,L}$ and $e^{<t}_{i,R} - \varsigma_{i,R}$ above these levels of constancy (see
[16] for a detailed description). Figure 5 shows $e^{<t}_{i,L}$ together with its infimum
process $\varsigma_{i,L}$.

We define $a_{i,L}(j)$, $j \ge 0$, to be the successive levels of constancy of $\varsigma_{i,L}$,
and let $t_{i,L}(j) = t - a_{i,L}(j)$ be their distance from level $t$. For each level
of constancy $a_{i,L}(j)$ let $e^{<t}_{i,L}(j)$ be the excursion of $e^{<t}_{i,L} - \varsigma_{i,L}$ that lies
above the level $a_{i,L}(j)$. Let $h_{i,L}(j)$ be the height of this excursion,
$h_{i,L}(j) = \sup(e^{<t}_{i,L}(j))$, and let $\Upsilon_{i,L}(j)$ be the tree whose contour process is the
excursion $e^{<t}_{i,L}(j)$. Figure 5 shows an excursion $e^{<t}_{i,L}(j)$ with the p-subtree $\Upsilon_{i,L}(j)$
it defines. Note that all the star marks due to p-sampling are contained in
the excursions $e^{<t}_{i,L}(j)$, hence are contained in the subtrees $\Upsilon_{i,L}(j)$. An
analogous definition leads to $a_{i,R}(j)$, $j \ge 0$, $h_{i,R}(j)$ and $\Upsilon_{i,R}(j)$ from $e^{<t}_{i,R}(j)$ and
$\varsigma_{i,R}$. With each point $(\ell_i,t_i)$ of $\Pi_{t,n}$ we now associate the sets

(13)    $\mathcal{L}_i = \{(t_{i,L}(j), \Upsilon_{i,L}(j))\}_{j \ge 0}$   and   $\mathcal{R}_i = \{(t_{i,R}(j), \Upsilon_{i,R}(j))\}_{j \ge 0}$.

In addition, for extreme ends we define one set $\mathcal{R}_0$ from $e^{<t}_{0,R}$ and we define
a set $\mathcal{L}_n$ from $e^{<t}_{n,L}$. For ease of future notation we set $\mathcal{L}_0 = \emptyset$, $\mathcal{R}_n = \emptyset$,
$(\ell_0,t_0) = (0,t)$ and $(\ell_n,t_n) = (n,t)$.

Definition 6. The p-sampled historical point-process $\Xi^p_{t,n}$ is the random
set

(14)    $\Xi^p_{t,n} = \{(\ell_i, t_i, \mathcal{L}_i, \mathcal{R}_i) : (\ell_i,t_i) \in \Pi_{t,n},\ 0 \le i \le n\}$.

Fig. 5. The left half $e^{<t}_{i,L}$ of an excursion of $C_{\mathcal{T}_{t,n}}$ below $t$, with its infimum process
$\varsigma_{i,L}$ whose levels of constancy are $\{a_{i,L}(j)\}_j$, above which lie the p-marked subtrees
$\{\Upsilon_{i,L}(j)\}_j$ of heights $\{h_{i,L}(j)\}_j$.


Remark 7. We have in fact implicitly defined a point-process
representation $\Xi_{t,n}$ of a complete historical point-process (which would correspond
to 1-sampling). The difference between $\Xi_{t,n}$ and $\Xi^1_{t,n}$ is only in the *'s on
the leaves in the latter. It will, however, be clear that for nice asymptotic
behavior we need to consider $\Xi^p_{t,n}$ with $p < 1$: in other words we can only
keep track of a proportion of the extinct individuals.

We can now derive the law of the point-process $\Xi^p_{t,n}$. For this we shall also
need the law of the p-subtrees appearing in the sets $\mathcal{L}_i$ and $\mathcal{R}_i$. Let $\mathbb{T}$ denote
the space of finite rooted binary trees with edge-lengths, and let $\Lambda$ denote
the law on $\mathbb{T}$ of the tree $\mathcal{T}$. Then, let $\Lambda^p$ denote the law on $\mathbb{T}$ induced by
the p-sampling on the tree $\mathcal{T}$. Further, for any $h > 0$, let $\Lambda^p_h$ denote the law
induced by restricting $\Lambda^p$ to the trees $\mathcal{T}$ of height $h$.

To describe the law of $\Xi^p_{t,n}$ we use a more careful and detailed analysis of
the structure of the contour process $C_{\mathcal{T}_{t,n}}$. First we use the result of Lemma 3,
which gives us the law of the main points of $\Xi^p_{t,n}$. Then conditional on
the location of the main points, we give the law of the sets $\mathcal{L}_i$ and $\mathcal{R}_i$ of
p-subtrees. We show that the sets $\mathcal{L}_i$ and $\mathcal{R}_i$ are independent Poisson
point-processes. The intensity measure of each such set is given by the following.
First, choose $t_{i,L}(j)$, the distances below $t$ at which the p-sampled subtrees
are getting attached, uniformly over $t_i$, the total distance below $t$ to the $i$th
branch-point. Next, choose $h_{i,L}(j)$, the height for each p-subtree, according
to the same law as that of the height of a tree $\mathcal{T}$ whose height is known to
be less than $t_{i,L}(j)$. Finally, choose the law of $\Upsilon_{i,L}(j)$, the attaching subtree,
according to the law $\Lambda^p_{h_{i,L}(j)}$ described above.

Lemma 6. For any fixed $0 < p < 1$, the law of the random set $\Xi^p_{t,n}$ is
given by the following:

(a) $\{(\ell_i,t_i) : 1 \le i \le n-1\}$ is the simple point-process $\Pi_{t,n}$ of Lemma 3;

(b) given $\{(\ell_i,t_i),\ 1 \le i \le n-1\}$, the sets $\mathcal{L}_i$ and $\mathcal{R}_i$ are independent; and
for each $0 \le i \le n$ the random sets $\mathcal{L}_i$ and $\mathcal{R}_i$ are Poisson point-processes
on $\mathbb{R}^+ \times \mathbb{T}$ with intensity measure

(15)    $\mathbf{1}_{\{0<\tau<t_i\}}\, d\tau\; \mathbf{1}_{\{0<h<\tau\}}\, \frac{dh}{(1+h)^2}\, \frac{1+\tau}{\tau}\, \Lambda^p_h.$

Proof. The proof relies on exploiting the alternating Exponential(rate
1) step structure of the contour process $C_{\mathcal{T}_{t,n}}$. From Lemma 3 we have that
the excursions of $C_{\mathcal{T}_{t,n}}$ below $t$ are independent and that their law is the same
as that of $C_\mathcal{T}$ conditioned on having height less than $t$. We further show that,
when decomposed into the part before its lowest point and a part after it,
the two parts of these excursions are conditionally independent given the
excursion's depth. In fact, if the former is run forward to the lowest point,
and the latter backward to the lowest point, these two parts have the same
conditioned law. We obtain a simple description of the law of the levels of
constancy of the infimum process for these parts, and the excursions above
these levels of constancy are shown to be copies of $C_\mathcal{T}$ restricted in their
height.

Independence of the pairs of sets $\mathcal{L}_i, \mathcal{R}_i$ over the index $i$ follows from
the independence of the excursions $e^{<t}_i$ of $X_{\mathcal{T}_{t,n}}$ below level $t$ as shown in
Lemma 3. The strong Markov property of $X_\mathcal{T}$ also gives the independence
of $\mathcal{R}_0$ and $\mathcal{L}_n$ from all the pairs $\mathcal{L}_i, \mathcal{R}_i$. The proof of Lemma 3 also shows
that the law of the excursions $t - e^{<t}_i$ is the same as that of $X_\mathcal{T}$ conditioned
on $\sup(C_\mathcal{T}) < t$. The left half $t - e^{<t}_{i,L}$ of such an excursion is defined as the
part of $t - e^{<t}_i$ until it reaches its maximum, and the right half $t - e^{<t}_{i,R}$ as
the part after this maximum, run backward in time (the $u$-coordinate). To
derive the conditional law of $\mathcal{L}_i, \mathcal{R}_i$ given $t_i$ we thus need to analyze the
conditional law of the two parts of $X_\mathcal{T}$ on either side of its maximum, given
the maximum's value.

Let us first consider the process $X_\mathcal{T}$ continued past its first hitting time
of $(0,-1)$. Let $S_\mathcal{T}$ be its maximum process

        $S_\mathcal{T}(u) = \sup_{0 \le v \le u} C_\mathcal{T}(v)$,   $u \ge 0$,

and let us consider the process $(S_\mathcal{T}, S_\mathcal{T} - C_\mathcal{T})$, which clearly completely
describes $X_\mathcal{T}$. The process $(S_\mathcal{T}, S_\mathcal{T} - C_\mathcal{T})$ consists of an alternating sequence
of the following: rises of slope 1 for $S_\mathcal{T}$ paired with the intervals at 0 for
$S_\mathcal{T} - C_\mathcal{T}$, and levels of constancy for $S_\mathcal{T}$ paired with the excursions from 0
for $S_\mathcal{T} - C_\mathcal{T}$. Figure 6(a) shows a decomposition of $X_\mathcal{T}$ into $S_\mathcal{T}$ and $S_\mathcal{T} - C_\mathcal{T}$.
Because the alternating steps of $C_\mathcal{T}$ are independent Exponential(rate 1)
variables, it is not difficult to see that the alternating steps of $(S_\mathcal{T}, S_\mathcal{T} - C_\mathcal{T})$
are independent, and that the rises of $S_\mathcal{T}$ all have the same Exponential(rate
1) distribution while the excursions of $S_\mathcal{T} - C_\mathcal{T}$ all have the same distribution
as $X_\mathcal{T}$ [stopped when it first hits $(0,-1)$]. Namely, the first rise of $S_\mathcal{T}$ is
just the first rise of $C_\mathcal{T}$; the $X_\mathcal{T}$ law and independence of a subsequent
excursion of $S_\mathcal{T} - C_\mathcal{T}$ is immediate from the law of $C_\mathcal{T}$; and, finally, the
Exponential(rate 1) law and independence of a subsequent rise of $S_\mathcal{T}$ is just
a consequence of the memoryless property of rises of $C_\mathcal{T}$.

Consider now the point-process $\{(s, e_s)\}$ of levels of constancy of $S_\mathcal{T}$
paired with the excursions of $S_\mathcal{T} - C_\mathcal{T}$ below them. The above analysis shows
that $\{s(u),\ u \ge 0\}$ is a Poisson(rate 1) process, and the excursions $e_s$ all
have the same law as $C_\mathcal{T}$. We shall denote the law of $C_\mathcal{T}$ by $n$ (as in the
proof of Lemma 3). Then $\{(s, e_s)\}$ forms a Poisson point-process with
intensity measure $ds\, n$. Note that it was shown in (6) that the height of $C_\mathcal{T}$, and
hence the height of these excursions, is given by $n(\sup(\cdot) > h) = 1/(1+h)$,
for $h \ge 0$.
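
The height law $n(\sup(\cdot) > h) = 1/(1+h)$ can be illustrated by simulation. The following minimal Python sketch is an illustration only, not part of the paper. It assumes the contour process starts at 0 with a rise, alternates independent Exponential(rate 1) rises and falls, and is stopped when it returns to 0. The function names `reaches_height` and `estimate_height_tail` are hypothetical.

```python
import random

def reaches_height(h, rng):
    """Run one contour walk; return True if it exceeds h before returning to 0."""
    height = 0.0
    while True:
        height += rng.expovariate(1.0)   # rise of slope +1 with Exponential(rate 1) length
        if height > h:
            return True
        height -= rng.expovariate(1.0)   # fall of slope -1 with Exponential(rate 1) length
        if height <= 0.0:
            return False                 # the walk returned to 0 without exceeding h

def estimate_height_tail(h, trials=20000, seed=0):
    """Monte Carlo estimate of the chance that the walk's supremum exceeds h."""
    rng = random.Random(seed)
    hits = sum(reaches_height(h, rng) for _ in range(trials))
    return hits / trials
```

Under these assumptions, `estimate_height_tail(h)` should be close to `1 / (1 + h)`, up to Monte Carlo error.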

We now consider how the maximum of $X_\mathcal{T}$ in the time interval before
it first hits $(0,-1)$ appears within this point-process $\{(s, e_s)\}$. We denote
the first hitting time of $(0,-1)$ by $U_{(0,-1)} = \inf\{u : X_\mathcal{T}(u) = (0,-1)\}$, and the
maximum of interest by $M = \sup\{X_\mathcal{T}(u) : u \in (0, U_{(0,-1)})\}$. Then, one can
easily note that $s(U_{(0,-1)}) = M$ and that $\forall s \in (0,M)$, $h(e_s) < s$, whereas for
$s = M$, $h(e_M) \ge M$. Figure 6(b) depicts a realization of $\{(s, e_s)\}$. In other
words, the point $(M, e_M)$ is the first point (in terms of the $s$-coordinate) of
the process $\{(s, e_s)\}$ which falls outside the set $\{(s, h_s) : s > 0,\ h_s > 0,\ h_s < s\}$.
Independence of Poisson random measures on disjoint sets then implies
that, given the value of $M = \sup\{X_\mathcal{T}(u) : u \in (0, U_{(0,-1)})\}$, the conditional
law of the process $\{(s, e_s) : s < M\}$ is independent of the point $(M, e_M)$ and
has the law of a Poisson point-process with intensity measure

        $\mathbf{1}_{\{0<s<M\}}\, ds\; \mathbf{1}_{\{0<h<s\}}\, \frac{dh}{(1+h)^2}\, n(\,\cdot \mid \sup(\cdot) = h).$

Fig. 6. (a) The process $X_\mathcal{T}$ continued past its first hitting of $(0,-1)$, its maximum process
$S_\mathcal{T}$ and the excursion process $S_\mathcal{T} - C_\mathcal{T}$ below the levels of constancy of $S_\mathcal{T}$; $M$ is the
maximal value of $X_\mathcal{T}$ before it first hits $(0,-1)$. (b) The point-process $\{(s, h(e_s))\}$ of the
values of constancy of $S_\mathcal{T}$, paired with the heights of excursions of $S_\mathcal{T} - C_\mathcal{T}$ below them.


It is important now to note that the process $\{(s, e_s) : s < M\}$ completely
describes the part of $X_\mathcal{T}$ before it reaches the maximum $M$, while the point
$(M, e_M)$ completely describes the part of $X_\mathcal{T}$ after it reaches $M$ and before
it first hits $(0,-1)$.

We can now tend to the quantities of interest within the excursions $t - e^{<t}_i$.
We have that the conditional law of $t - e^{<t}_{i,L}$, $t - e^{<t}_{i,R}$ given $t_i$ is the same
as the conditional law of the parts of $X_\mathcal{T}$ before and after its maximum $M$
given that $M = t_i$. Within $t - e^{<t}_{i,L}$ the levels of constancy $t_{i,L}(j)$ and the
associated excursions $t - e^{<t}_{i,L}(j)$ above these levels precisely correspond to
the levels of constancy $s$ and its associated excursions $e_s$ within the process
$\{(s, e_s) : s < t_i\}$. Moreover, $t - e^{<t}_{i,R}$ precisely corresponds to the part of the
excursion $e_M$ before it hits $(0,-1)$, reversed in time (cf. Figures 5 and 6).

From our analysis above it thus follows that $t - e^{<t}_{i,L}$ and $t - e^{<t}_{i,R}$ are
conditionally independent given $t_i$, and that $\{(t_{i,L}(j), e^{<t}_{i,L}(j))\}_j$
has the law of a Poisson point-process with intensity measure

        $\mathbf{1}_{\{0<\tau<t_i\}}\, d\tau\; \mathbf{1}_{\{0<h<\tau\}}\, \frac{dh}{(1+h)^2}\, n(\,\cdot \mid \sup(\cdot) = h).$


Now the strong Markov property implies that the p-sampling on the local
maxima of the whole contour process $C_{\mathcal{T}_{t,n}}$ is for each $e^{<t}_{i,L}(j)$ again a Bernoulli
p-sampling on its local maxima. Thus the conditional law of the p-sampled
tree $\Upsilon_{i,L}(j)$ defined from $e^{<t}_{i,L}(j)$ given the height $h_{i,L}(j) = \sup(e^{<t}_{i,L}(j))$ is
$\Lambda^p_{h_{i,L}(j)}$. Putting all the above results together we have that given $t_i$ the
random set $\mathcal{L}_i = \{(t_{i,L}(j), \Upsilon_{i,L}(j))\}_{j \ge 0}$ is conditionally independent of the
set $\mathcal{R}_i$, and its law is a Poisson point-process with intensity measure

        $\mathbf{1}_{\{0<\tau<t_i\}}\, d\tau\; \mathbf{1}_{\{0<h<\tau\}}\, \frac{dh}{(1+h)^2}\, \Lambda^p_h.$

The same conditional law of $\mathcal{R}_i$ follows from time reversibility of the law
of $X_\mathcal{T}$. □


Let us now consider the implications that the p-sampling of extinct
individuals has in the asymptotic context. In Section 2, the genealogical
point-process was defined from the contour process $C_{\mathcal{T}_{t,n}}$, and its asymptotics was
identified as the continuum genealogical point-process similarly defined from
a Brownian excursion $\mathbb{B}_{t,1}$ conditioned to have local time 1 at level $t$. Now the
p-sampled historical process is defined from a contour process $C_{\mathcal{T}_{t,n}}$ whose
local maxima are sampled independently with equal chance $p$. In terms of the
(horizontal) $u$-coordinate of $C_{\mathcal{T}_{t,n}}$ the p-sampled individuals form a random
set of marks on $\mathbb{R}^+$. The fact that $C_\mathcal{T}$ is an alternating sum of independent
Exponential(rate 1) random variables implies that the random set formed
by the local maxima of $C_\mathcal{T}$ is a Poisson process of rate 1/2 on $\mathbb{R}^+$, and
the same still holds for the sets formed by the local maxima of each part
of an excursion of $C_{\mathcal{T}_{t,n}}$ below $t$. If we further sample these local maxima
independently with chance $p$ we have a Poisson process of rate $p/2$ on $\mathbb{R}^+$.
Now, for the asymptotics, the appropriate rescaling, as in Section 2, speeds
up the time axis of $C_{\mathcal{T}_{t,n}}$ by $n$. Hence if we consider $p_n$ such that $np_n \to p$
as $n \to \infty$, then asymptotically the $p_n$-sampling on $C_{\mathcal{T}_{t,n}}$ will converge to a
Poisson process of rate $p/2$. This prompts us to consider for the asymptotics
of the p-historical point-process a process similarly defined from a
conditioned Brownian excursion $\mathbb{B}_{t,1}$ sampled according to a Poisson(rate $p/2$)
process along its (horizontal) $u$-coordinate.
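
The limiting sampling rate $p/2$ can be illustrated numerically. The following minimal Python sketch is an illustration only. It ignores the conditioning of the contour process and tracks just the horizontal coordinate of an alternating sequence of Exponential(rate 1) rises and falls, samples each local maximum with chance $p_n = p/n$, and counts the sampled maxima in a rescaled time window. The function names and the default parameters are hypothetical.

```python
import random

def sampled_maxima_count(n, p, length, rng):
    """Count (p/n)-sampled local maxima in the rescaled time window [0, length]."""
    horizon = n * length              # the same window in unrescaled time
    u, count = 0.0, 0
    while True:
        u += rng.expovariate(1.0)     # rise; a local maximum sits at its end
        if u > horizon:
            return count
        if rng.random() < p / n:      # Bernoulli sampling with chance p_n = p / n
            count += 1
        u += rng.expovariate(1.0)     # fall down to the next local minimum

def mean_count(n=100, p=2.0, length=3.0, reps=2000, seed=1):
    """Average number of sampled maxima in the window over independent runs."""
    rng = random.Random(seed)
    total = sum(sampled_maxima_count(n, p, length, rng) for _ in range(reps))
    return total / reps
```

Under these assumptions the average count should be close to `p * length / 2`, the mean of a Poisson(rate $p/2$) process over a window of that length.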

Remark 8. We are interested in obtaining an asymptotic point-process
that has a.s. finitely many extinct individuals recorded. It is clear that thus
the rate of sampling asymptotically has to satisfy $np_n \to p$ as $n \to \infty$.

We define a process derived from a conditioned Brownian excursion $\mathbb{B}_{t,1}$
in the same manner that $\Xi^p_{t,n}$ was derived from the contour process $C_{\mathcal{T}_{t,n}}$ of the
conditioned branching process. Recall that $\mathbb{B}(u)$, $u \ge 0$, denotes a Brownian
excursion; for a fixed $t > 0$, $\ell_t(u)$, $u \ge 0$, is its local time at level $t$ up to
time $u$; and $i_t(\ell)$, $\ell \ge 0$, is the inverse process of $\ell_t$. Also, $\mathbb{B}_{t,1}(u)$, $u \ge 0$, denotes
the excursion $\mathbb{B}$ conditioned to have total local time at $t$ equal to 1, and
$(\ell, e^{<t}_\ell)$ denotes the set of excursions of $\mathbb{B}_{t,1}$ below level $t$ indexed by the
local time $\ell$ at the time of their beginning.

Define the p-sampling on $\mathbb{B}_{t,1}$ to be a Poisson(rate $p/2$) process along the
$u$-axis of $\mathbb{B}_{t,1}$. We indicate this by putting a star mark on the graph of $\mathbb{B}_{t,1}$
at the times of this Poisson process. Let $e^{<t}_\ell$ be an excursion of $\mathbb{B}_{t,1}$ below
level $t$,

        $e^{<t}_\ell(u) = \mathbb{B}_{t,1}(i_t(\ell^-) + u)$,   $u \in [0, i_t(\ell) - i_t(\ell^-))$.

Recall that $a_\ell = \inf(e^{<t}_\ell)$ is its lowest point occurring at $u_\ell = \arg\min(e^{<t}_\ell(u))$,
and that $t_\ell = t - a_\ell$ denotes its distance from level $t$. For each $e^{<t}_\ell$ we define
its left and right parts (relative to its lowest point) to be

        $e^{<t}_{\ell,L}(u) = \mathbb{B}_{t,1}(i_t(\ell^-) + u)$,   $u \in [0, u_\ell - i_t(\ell^-))$,

        $e^{<t}_{\ell,R}(u) = \mathbb{B}_{t,1}(i_t(\ell) - u)$,   $u \in [0, i_t(\ell) - u_\ell)$.

Note that $e^{<t}_{\ell,L}$ runs forward to the lowest point of $e^{<t}_\ell$, whereas $e^{<t}_{\ell,R}$ runs
backward in time to it. We shall also need their respective processes of infima

        $\varsigma_{\ell,L}(u) = \inf_{0 \le v \le u} e^{<t}_{\ell,L}(v)$,   $u \in [0, u_\ell - i_t(\ell^-))$,

        $\varsigma_{\ell,R}(u) = \inf_{0 \le v \le u} e^{<t}_{\ell,R}(v)$,   $u \in [0, i_t(\ell) - u_\ell)$.

Figure 7 shows $e^{<t}_{\ell,L}$ and $e^{<t}_{\ell,R}$ with $\varsigma_{\ell,L}$ and $\varsigma_{\ell,R}$.

We define $a_{\ell,L}(j)$, $j \ge 0$, to be the successive levels of constancy of $\varsigma_{\ell,L}$,
and we let $t_{\ell,L}(j) = t - a_{\ell,L}(j)$ be their distance to level $t$. For each level of
constancy $a_{\ell,L}(j)$, let $e^{<t}_{\ell,L}(j)$ be the excursion of $e^{<t}_{\ell,L} - \varsigma_{\ell,L}$ that lies above the
level $a_{\ell,L}(j)$. Let $h_{\ell,L}(j) = \sup(e^{<t}_{\ell,L}(j))$ be the height of this excursion. Note
that a.s. all the p-sampled points on $\mathbb{B}_{t,1}$ lie on these excursions $e^{<t}_{\ell,L}(j)$. We
define a tree $\Upsilon_{\ell,L}(j)$ induced by such a p-sampled excursion $e^{<t}_{\ell,L}(j)$ as the
tree whose contour process is the linear interpolation of the sequence of the
values of $e^{<t}_{\ell,L}(j)$ at the p-sampling times, alternating with the sequence of
the minima of $e^{<t}_{\ell,L}(j)$ between the p-sampling times. An analogous definition
leads to $a_{\ell,R}(j)$, $j \ge 0$, $t_{\ell,R}(j)$, $j \ge 0$, $h_{\ell,R}(j)$ and $\Upsilon_{\ell,R}(j)$ from $e^{<t}_{\ell,R}(j)$ and
$\varsigma_{\ell,R}$.

Fig. 7. (Top) An excursion $e^{<t}_\ell$ of $\mathbb{B}_{t,1}$ below $t$, its left $e^{<t}_{\ell,L}$ and right $e^{<t}_{\ell,R}$ parts, with
their infimum processes; (bottom) the process $e^{<t}_{\ell,L} - \varsigma_{\ell,L}$.
Remark 9. This definition of a tree from an excursion path sampled
at given times has been explored for different sampling distributions in the
literature (for some examples see [16], Section 6). Since for each $e^{<t}_\ell$ there
are a.s. only finitely many p-sampled points, the trees $\{\Upsilon_{\ell,L}(j)\}_j$, $\{\Upsilon_{\ell,R}(j)\}_j$
are a.s. in the space $\mathbb{T}$ of rooted planar trees with edge-lengths and finitely
many leaves.

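With each point $(\ell, t_\ell)$ of $\pi_{t,1}$ we now associate the sets

(16)    $\mathcal{L}_\ell = \{(t_{\ell,L}(j), \Upsilon_{\ell,L}(j))\}_{j \ge 0}$   and   $\mathcal{R}_\ell = \{(t_{\ell,R}(j), \Upsilon_{\ell,R}(j))\}_{j \ge 0}$.

We also define the first “right” set $\mathcal{R}_0$ and the last “left” set $\mathcal{L}_1$ from paths
$e^{<t}_{0,R}$ of $\mathbb{B}_{t,1}$ before the first hitting time of $t$, and $e^{<t}_{1,L}$ of $\mathbb{B}_{t,1}$ after the last
hitting time of $t$. For ease of notation we let $\mathcal{L}_0 = \mathcal{R}_1 = \emptyset$, $t_0 = t_1 = t$.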

Definition 10. The p-sampled continuum historical point-process $\xi^p_{t,1}$
is the random set

(17)    $\xi^p_{t,1} = \{(\ell, t_\ell, \mathcal{L}_\ell, \mathcal{R}_\ell) : i_t(\ell^-) \ne i_t(\ell)\}$.

We next derive the law of the point-process $\xi^p_{t,1}$. For this we shall also need
the law of the trees induced by the p-sampled excursions of $e^{<t}_\ell - \varsigma_\ell$. Let $\lambda^p$
denote the law on the space $\mathbb{T}$ induced by a $\mathbb{B}$ sampled at Poisson(rate $p$)
points (in the sense of the bijection between sampled continuous functions
and trees [3], same as the definition of $\Upsilon_{\ell,L}(j)$ from the p-sampled $e^{<t}_{\ell,L}(j)$).
Then, for any $h > 0$, let $\lambda^p_h$ denote the law induced by restricting $\lambda^p$ to the
set of Brownian excursions $\mathbb{B}$ of height $h$.

To derive the law of $\xi^p_{t,1}$ we exploit in a more detailed manner the nice
properties of Brownian excursions. We first use the result of Lemma 4, which
gives us the law of the set $\{(\ell, t_\ell) : i_t(\ell^-) \ne i_t(\ell)\}$. Then conditional on this
set we give the law of the sets $\mathcal{L}_\ell$ and $\mathcal{R}_\ell$. We show that $\{\mathcal{L}_\ell, \mathcal{R}_\ell\}_\ell$ are
independent Poisson point-processes. The intensity measure of each such
set is given by the following. First, choose $t_{\ell,L}(j)$, the distances below $t$ at
which the p-sampled subtree excursions of $e^{<t}_\ell - \varsigma_\ell$ occur, uniformly over
$t_\ell$, the distance below $t$ of the lowest point of $e^{<t}_\ell$. Next, choose $h_{\ell,L}(j)$, the
height for each such p-sampled excursion, according to the same law as that
of the height of a $\mathbb{B}$ whose height is known to be less than $t_{\ell,L}(j)$. Finally,
choose the law of the induced tree $\Upsilon_{\ell,L}(j)$ according to the law $\lambda^p_{h_{\ell,L}(j)}$ described
above.

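Lemma 7. The random set $\xi^p_{t,1}$ is such that the following hold:

(a) $\{(\ell, t_\ell) : i_t(\ell^-) \ne i_t(\ell)\}$ is the Poisson point-process $\pi_{t,1}$ of Lemma 4;

(b) given $\{(\ell, t_\ell) : i_t(\ell^-) \ne i_t(\ell)\}$ the sets $\mathcal{L}_\ell$ and $\mathcal{R}_\ell$ are independent; and
for each $\ell$: $i_t(\ell^-) \ne i_t(\ell)$, $\mathcal{L}_\ell$ and $\mathcal{R}_\ell$ are Poisson point-processes on $\mathbb{R}^+ \times \mathbb{T}$
with intensity measure

(18)    $\mathbf{1}_{\{0<\tau<t_\ell\}}\, d\tau\; \mathbf{1}_{\{0<h<\tau\}}\, \frac{dh}{h^2}\, \lambda^p_h.$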

Proof. The proof proceeds in many of the same steps as the one for
deriving the law of the p-sampled historical process $\Xi^p_{t,n}$. The notable difference
is that we now have to resort to more sophisticated Markovian results on the
decomposition of a Brownian path, such as the Williams decomposition of a
Brownian excursion given its height, and the Pitman theorem on Bessel
processes. In short, we consider the decomposition of the conditioned Brownian
excursion $\mathbb{B}_{t,1}$ into its excursions below level $t$ provided by Lemma 4.
For each such excursion below $t$ given its lowest point at distance $t_\ell$ below
$t$, the Williams decomposition gives us the independence and identity in law
of its left and right parts, as well as the description of their laws in terms of
a three-dimensional Bessel process. Furthermore, we can use Pitman's
theorem that describes the law of the excursions of this Bessel process above
the levels of constancy of its future infimum. After taking care of some
conditioning issues, this finally gives us a simple description of these excursions
above the levels of constancy as simply Brownian excursions conditioned on
their maximal height.

The independence of the sets $\mathcal{L}_\ell$ over the index $\ell$ (the same holds for
the sets $\mathcal{R}_\ell$) follows from the independence of the excursions of $\mathbb{B}_{t,1}$ below
level $t$. This also holds (by the strong Markov property of $\mathbb{B}$) for the sets
$\mathcal{R}_0$ and $\mathcal{L}_1$ defined from the parts of the path of $\mathbb{B}_{t,1}$ of its ascent to level $t$
and its descent from it. For each excursion $e^{<t}_\ell$ of $\mathbb{B}_{t,1}$ below level $t$, we let
$e_\ell = t - e^{<t}_\ell$. By Lemma 4, the conditional law of $e_\ell$ given $(\ell, t_\ell)$ is that of a
Brownian excursion $\mathbb{B}$ conditioned on the value of its supremum, $\mathbb{B} \mid \{\sup(\mathbb{B}) = t_\ell\}$.
Let $T_{t_\ell} = \inf\{u > 0 : e_\ell(u) = t_\ell\}$; then by the Williams decomposition of
a Brownian excursion $\mathbb{B}$ (e.g., [19], 1, Section III.49), the law of $e_{\ell,L} = t - e^{<t}_{\ell,L}$
is that of a Bess(3) (three-dimensional Bessel) process $\rho$ stopped the first
time $\tau_{t_\ell} = \inf\{u > 0 : \rho(u) = t_\ell\}$ it hits $t_\ell$. By time reversibility of $\mathbb{B}$ the
process

        $r_{\ell,L}(u) = t_\ell - e_{\ell,L}(T_{t_\ell} - u)$,   $u \in (0, T_{t_\ell})$,

also has the law of the stopped Bess(3) process $\rho(u)$, $u \in (0, \tau_{t_\ell})$. Let

        $j_{\ell,L}(u) = \inf_{u \le v \le T_{t_\ell}} r_{\ell,L}(v)$,   $u \in (0, T_{t_\ell})$.

Then $\{t_\ell - t_{\ell,L}(j)\}_j$ are (in reversed index order) the successive levels of
constancy of the process $j_{\ell,L}(u)$, $u \in (0, T_{t_\ell})$; $\{h_{\ell,L}(j)\}_j$ (in reversed index
order) are the heights of the successive excursions from 0 of the process
$r_{\ell,L}(u) - j_{\ell,L}(u)$, $u \in (0, T_{t_\ell})$; and $\{\Upsilon_{\ell,L}(j)\}_j$ (in reversed index order) are
the trees induced by the p-sampled points on these excursions. To obtain
the law of $j_{\ell,L}$ and $r_{\ell,L} - j_{\ell,L}$ consider the Bess(3) process $\rho(u)$, $u \ge 0$, and
its future infimum process $j(u) = \inf_{v \ge u} \rho(v)$, $u \ge 0$. We note that the law
of $j_{\ell,L}(u)$, $u \in (0, T_{t_\ell})$, is equivalent to that of $j(u)$, $u \in (0, \tau_{t_\ell})$, if $j(\tau_{t_\ell}) = t_\ell$;
in other words, if $\rho(u)$, $u \ge 0$, after it first reaches $t_\ell$ never returns to that
height again. So,

        $(j_{\ell,L},\, r_{\ell,L} - j_{\ell,L}) \overset{d}{=} (j,\, \rho - j) \mid \{j(\tau_{t_\ell}) = t_\ell\}$   for $u \in (0, T_{t_\ell})$.

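By Pitman's theorem, then by Lévy's theorem (e.g., [18], VI, Sections 3 and 6),

        $(j,\, \rho - j) \overset{d}{=} (\zeta,\, \zeta - \beta) \overset{d}{=} (\ell,\, |\beta|)$,

where $\beta$ is a standard Brownian motion, $\zeta$ its supremum process; $|\beta|$ is a
reflected Brownian motion, $\ell$ its local time at 0 (with the occupation time
normalization). Thus, for $\hat{\tau}_{t_\ell} := \inf\{u > 0 : |\beta|_u + \ell_u = t_\ell\}$,

        $(j_{\ell,L},\, r_{\ell,L} - j_{\ell,L}) \overset{d}{=} (\ell,\, |\beta|) \mid \{\ell_{\hat{\tau}_{t_\ell}} = t_\ell\}$   for $u \in (0, T_{t_\ell})$.

The condition $\{\ell_{\hat{\tau}_{t_\ell}} = t_\ell\}$ is equivalent to the condition $\{\ell_{\hat{\tau}_{t_\ell}} = t_\ell,\ |\beta|_{\hat{\tau}_{t_\ell}} = 0\}$
and $\{\forall u < \hat{\tau}_{t_\ell} : \ell_u < t_\ell,\ |\beta|_u < t_\ell - \ell_u\}$. Hence,

(19)    $(j_{\ell,L},\, r_{\ell,L} - j_{\ell,L}) \overset{d}{=} (\ell,\, |\beta|) \mid \{\ell_u < t_\ell,\ |\beta|_u < t_\ell - \ell_u;\ \ell_{\hat{\tau}_{t_\ell}} = t_\ell,\ |\beta|_{\hat{\tau}_{t_\ell}} = 0\}$.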

Since $\{(\ell, \sup(|\beta|))\}$ is a Poisson point-process with intensity measure $d\ell\, dh/h^2$,
then using the independence property of a Poisson random measure on
disjoint sets in (19), we obtain for $\tau = t_\ell - \ell$ that $\{(t_\ell - j_{\ell,L},\ \sup(r_{\ell,L} - j_{\ell,L}))\}$ is
a Poisson point-process with intensity measure

        $\mathbf{1}_{\{0<\tau<t_\ell\}}\, d\tau\; \mathbf{1}_{\{0<h<\tau\}}\, \frac{dh}{h^2}.$

Recall the relationship of the values $\{t_{\ell,L}(j), h_{\ell,L}(j), \Upsilon_{\ell,L}(j)\}_j$ of $\mathcal{L}_\ell$ with
the processes $j_{\ell,L}$ and $r_{\ell,L} - j_{\ell,L}$. The above result thus implies that $\mathcal{L}_\ell$ is a
Poisson point-process with intensity measure

        $\mathbf{1}_{\{0<\tau<t_\ell\}}\, d\tau\; \mathbf{1}_{\{0<h<\tau\}}\, \frac{dh}{h^2}\, \lambda^p_h,$

where the last factor comes from the fact that $\Upsilon_{\ell,L}(j)$ is just the tree induced
by the p-sampled excursion of $|\beta|$ of height $h_{\ell,L}(j)$. □

Our next goal is to show that the process $\xi^p_{t,1}$ whose law we have just
obtained is indeed the asymptotic result of the processes $\Xi^p_{t,n}$ after appropriate
rescaling. To do so, we first must show that the laws $\Lambda^{p_n}_h$ on the space of
trees converge as $n \to \infty$ to the law $\lambda^p_h$ if $np_n \to p$. We need to consider more
closely the trees $\Upsilon_{i,L}(j)$ and $\Upsilon_{\ell,L}(j)$ induced by the sampled excursions
appearing in the historical point-processes above. In both cases we have an
excursion, $C_\mathcal{T}$ or $\mathbb{B}$, of a given height and with marks on it produced by a
sampling process. Laws of the trees induced by sampled excursions of
unrestricted height can be very simply and elegantly described (see [10] for the
case of $\mathbb{B}$). However, for the trees from excursions of a given height that we
need to consider here, the description is much messier. We shall give next
a recursive description that applies equally to define an $\Upsilon_{i,L}(j)$ from $C_\mathcal{T}$
of a given height, or to define $\Upsilon_{\ell,L}(j)$ from $\mathbb{B}$ of a given height. A similar
recursive description of an infinite tree induced by an unsampled Brownian
excursion is given by Abraham and Mazliak [2].

Define the “spine” of the tree to extend from the root of the tree to the 
point of maximal height in the excursion. An equivalent representation of 
the tree is one in which the subtrees of the trees on the left and on the 
right of the axis through the spine are attached to this spine, an example 
of which is shown in Figure 8. We obtain the branch levels at which these 
subtrees are attached, as well as parameters needed for the description of 
the subtrees as follows. 



Fig. 8. The “first” set in the recursive description consists of branch levels $\{t_L(j)\}_j$ at
which subtrees induced by sampled excursions of $e_L - \varsigma_L$ are attached to the spine; and the
heights $\{h_L(j)\}_j$ of these subtrees.

We denote the excursion function defining this tree by $e(u)$, $u \ge 0$ (in other
words $e = C_\mathcal{T}$ or $e = \mathbb{B}$). Let $h$ be its given height, and $U_h = \arg\max\{e(u) : u \ge 0\}$
the time at which it is achieved. Then let $e_L(u)$, $u \in [0, U_h]$, be the
left part of the excursion, and we also define its future infimum process
$\varsigma_L(u) = \inf_{v \ge u} e_L(v)$, $u \in [0, U_h]$. Then the subtrees attaching on the left of
the spine are defined by the process $e_L - \varsigma_L$ and the set of sampled marks.
They are precisely the trees induced by the sampled excursions $e_L(j)$ of
$e_L - \varsigma_L$ whose height is some $h_L(j)$. The levels at which they are attached
to the spine are the levels of constancy $t_L(j)$ of $\varsigma_L$ at which the excursions
of $e_L - \varsigma_L$ occur. Thus the set $\{(t_L(j), h_L(j))\}_{j \ge 0}$ is the “first” set in our
recursive definition of trees. The “second” set is derived in the same manner
from the sampled excursions $\{e_L(j)\}_j$ and so on. We define these sets
analogously for the right part of $e$.

This recursive procedure is clearly very similar to our definition of the
left and right sets $\mathcal{L}_i$, $\mathcal{R}_i$ for $e^{<t}_i$ and $\mathcal{L}_\ell$, $\mathcal{R}_\ell$ for $e^{<t}_\ell$ as defined earlier. The
main difference is that the subtrees here are defined from excursions above
the levels of constancy of the future infimum process for $e$, whereas earlier
they were defined from excursions above the levels of constancy of the past
infimum process for $e^{<t}_i$ and $e^{<t}_\ell$. However, time inversion and reflection
invariance of the transition function of $e$ will allow us to easily derive the
laws of the “first” set of points here from the results of Lemmas 6 and 7.
In the next lemma we give a recursive description of the law of $\Lambda^{p_n}_h$ and
$\lambda^p_h$, and we show that we do have the convergence of the (appropriately
rescaled) $\Lambda^{p_n}_h$ to $\lambda^p_h$ if $np_n \to p$.


Lemma 8. The law $\Lambda^{p_n}_h$ of a tree induced by a $p_n$-sampled contour
process $C_\mathcal{T}$ of a given height $h$ is such that the first sets of points $\{t_L(j), h_L(j)\}_j$
and $\{t_R(j), h_R(j)\}_j$ are independent Poisson point-processes with intensity
measure

(20)    $\frac{1}{\sqrt{p_n}}\, \mathbf{1}_{\{0<\tau<h\}}\, d\tau\; \mathbf{1}_{\{0<\kappa<h-\tau\}}\, \frac{d\kappa}{(1+\kappa)^2}\, \frac{1+\tau}{\tau}.$


The law $\lambda^p_h$ of a tree induced by a p-sampled Brownian excursion $\mathbb{B}$ of
a given height $h$ is such that the first sets of points $\{t_L(j), h_L(j)\}_j$ and
$\{t_R(j), h_R(j)\}_j$ are independent Poisson point-processes with intensity
measure


( 21 ) 


1 dK 

~^=^-{ p < T < h ) dr 1(o<k</i—t)^2 ■ 


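Let $n^{-1}\Lambda^{p_n}_{nh}$ be the law of the tree induced by a $p_n$-sampled contour
process $C_{\mathcal{T}}$ rescaled by $n^{-1}$ in the vertical coordinate. Then for any $\{p_n \in (0,1)\}_{n \geq 1}$
such that $np_n \to p$ as $n \to \infty$ we have $n^{-1}\Lambda^{p_n}_{nh} \Rightarrow \Lambda^{p}_{h}$ as $n \to \infty$.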
Proof. The key for this proof is to observe the following. If $e(u)$, $u \geq 0$,
is the $p_n$-sampled process $C_{\mathcal{T}} \mid \{\sup(C_{\mathcal{T}}) = h\}$, then $e_L(u) = e(u)$, $u \in [0, U_h]$,
has the law of a $p_n$-sampled $X_{\mathcal{T}} \mid \{\tau_h < \tau_0\}$, where $\tau_h, \tau_0$ are the first hitting
times of $(h, +1)$, $(0, -1)$, respectively, by $X_{\mathcal{T}}$. Then time reversibility and
the reflection invariance of the transition function of $X_{\mathcal{T}}$ imply that
$h - e_L(U_h - u)$, $u \in [0, U_h]$, has the same law as $e_L(u)$, $u \in [0, U_h]$. Now the
levels of constancy of $\varsigma_L$, and the corresponding excursions $e_L - \varsigma_L$ above
them, are equivalent to the levels of constancy and excursions of a set $\mathcal{L}_i$
considered in Lemma 6, thus giving a Poisson process of intensity measure
as in (15). The factor $p_n^{-1/2}$ in the intensity measure (20) comes from the
fact that here we only consider the excursions of $e_L - \varsigma_L$ that have at least
one sampled mark in them. Namely, for the branching process $\mathcal{T}$, if $N_{tot}$
denotes the total population size of $\mathcal{T}$, then the generating function of $N_{tot}$
is $E(x^{N_{tot}}) = 1 - (1 - x)^{1/2}$. Hence, the chance of at least one mark in the
$p_n$-sampled point-process of $\mathcal{T}$ is $1 - E((1 - p_n)^{N_{tot}}) = p_n^{1/2}$.
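The last identity can be checked numerically from the generating function alone. The sketch below expands $1 - (1-x)^{1/2}$ as a power series (its coefficients play the role of $P(N_{tot} = k)$) and sums it at $x = 1 - p$; the function names and the truncation parameter `terms` are illustrative choices, not part of the paper.

```python
import math


def prob_no_mark(p, terms=2000):
    """Approximate E((1-p)^N) when E(x^N) = 1 - sqrt(1 - x), by summing the power series."""
    x = 1.0 - p
    coeff = 0.5   # coefficient of x^1 in 1 - sqrt(1 - x)
    power = x     # x^k, starting at k = 1
    total = 0.0
    for k in range(1, terms + 1):
        total += coeff * power
        coeff *= (2 * k - 1) / (2 * k + 2)   # ratio of consecutive coefficients
        power *= x
    return total


def prob_at_least_one_mark(p, terms=2000):
    """Chance that at least one individual is marked when each is marked with probability p."""
    return 1.0 - prob_no_mark(p, terms)


for p in (0.04, 0.25, 0.81):
    print(p, prob_at_least_one_mark(p), math.sqrt(p))
```

For sampling probabilities that are not extremely small the truncated series agrees with $p^{1/2}$ to floating-point accuracy; for very small $p$ more terms would be needed because the series then converges slowly.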

A similar argument applies when $e(u)$, $u \geq 0$, is the process $B \mid \{\sup(B) = h\}$
sampled at Poisson(rate $p/2$) times. Time reversibility and reflection
invariance of the transition function of $B$ allow us to identify that the law
of the levels of constancy of $\varsigma_L$, and the corresponding excursions $e_L - \varsigma_L$
above them, are the same as those for a set $\mathcal{L}_i$ considered in Lemma 7,
which we know form a Poisson process with intensity measure as in (18).
The factor $p^{-1/2}$ in the intensity measure (21) then comes from the rate
of excursions with at least one sampled mark. Namely, a Poisson(rate $p/2$)
process of marks on $B$ along its time coordinate is in its local time coordinate
a Poisson(rate $p^{1/2}$) process of marks (see [19], 2, Section VI.50).
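The change of rate from $p/2$ to $p^{1/2}$ can be sanity-checked numerically. The sketch below assumes the standard normalization of Brownian local time, under which excursion lengths $l$ have intensity $dl/\sqrt{2\pi l^3}$ per unit of local time; this normalization is an assumption of the example and is not stated in the text. An excursion of length $l$ carries at least one mark with probability $1 - e^{-\lambda l}$ when marks arrive at rate $\lambda$, so the rate of marked excursions is the integral of that probability against the length intensity.

```python
import math


def marked_excursion_rate(mark_rate, cutoff=50.0, steps=200_000):
    """Approximate the integral of (1 - exp(-mark_rate * l)) / sqrt(2 * pi * l**3) over l > 0.

    Assumption: excursion lengths have intensity dl / sqrt(2 * pi * l**3).
    Substituting l = s * s removes the singularity at 0; the integrand becomes
    2 * (1 - exp(-mark_rate * s * s)) / (s * s * sqrt(2 * pi)).
    """
    width = cutoff / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * width                      # midpoint rule on [0, cutoff]
        total += -math.expm1(-mark_rate * s * s) / (s * s)
    total *= width
    total += 1.0 / cutoff                          # tail, where the exponential is negligible
    return 2.0 * total / math.sqrt(2.0 * math.pi)


for p in (0.25, 1.0, 4.0):
    print(p, marked_excursion_rate(p / 2.0), math.sqrt(p))
```

Under this normalization the integral equals $\sqrt{2\lambda}$, which is $p^{1/2}$ for $\lambda = p/2$, in agreement with the statement quoted from [19].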

Now the law of the first set of the rescaled process with law $n^{-1}\Lambda^{p_n}_{nh}$ converges
to the law of the first set of the process with the law $\Lambda^{p}_h$. This follows from
the fact that the former is a sequence of Poisson point-processes whose
support set and intensity measure converge to those of the latter Poisson
point-process. Since for Poisson random measures the convergence of
finite-dimensional sets is sufficient to ensure weak convergence of the whole process,
our claim follows for the first sets, and by recursion for the whole process.
□
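The convergence of the intensity measures under rescaling can be made concrete for the height coordinate. The sketch below assumes that the height densities in (20) and (21) are $d\kappa/(1+\kappa)^2$ and $d\kappa/\kappa^2$ respectively, and it deliberately leaves out the sampling prefactor and any dependence on the level $\tau$; the function names are hypothetical. Rescaling both coordinates by $n^{-1}$ maps a unit of rescaled level to $n$ original units and the rescaled height window $(a, b)$ to $(na, nb)$.

```python
def rescaled_height_mass(n, a, b):
    """Mass per unit of rescaled level that d(kappa)/(1+kappa)^2 assigns to n*a < kappa < n*b."""
    return n * (1.0 / (1.0 + n * a) - 1.0 / (1.0 + n * b))


def limit_height_mass(a, b):
    """Mass that d(kappa)/kappa^2 assigns to a < kappa < b."""
    return 1.0 / a - 1.0 / b


a, b = 0.5, 2.0
for n in (10, 1000, 10**6):
    print(n, rescaled_height_mass(n, a, b), limit_height_mass(a, b))
```

The rescaled mass equals $n^2(b-a)/((1+na)(1+nb))$, which increases to $(b-a)/(ab)$ as $n \to \infty$ for any fixed $0 < a < b$.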


Finally, we can obtain the asymptotic result for the $p_n$-sampled historical
point-processes. The rescaling of $\Xi^{p_n}_{t_n,n}$ is the same as that for $\Pi_{t_n,n}$. Both
coordinates of $\Pi_{t_n,n}$ are rescaled by $n^{-1}$, so that the vertical coordinate of
the sets $\mathcal{L}_i$, $\mathcal{R}_i$ is also rescaled by $n^{-1}$, and the sampling rate is rescaled by $n$.
Hence the rescaled process is defined as

(22)  $$n^{-1}\Xi^{p_n}_{t_n,n} = \{(n^{-1}l_i,\, n^{-1}\tau_i,\, n^{-1}\mathcal{L}_i,\, n^{-1}\mathcal{R}_i) : (l_i, \tau_i, \mathcal{L}_i, \mathcal{R}_i) \in \Xi^{p_n}_{t_n,n}\}.$$


The asymptotic properties of the rescaled p-sampled historical process are 
now easily established from our earlier results. 
Theorem 9. For any $\{t_n > 0\}_{n \geq 1}$ and $\{p_n \in (0,1)\}_{n \geq 1}$ such that $t_n/n \to t$
and $np_n \to p$ as $n \to \infty$, we have $n^{-1}\Xi^{p_n}_{t_n,n} \Rightarrow \xi^{p}_{t}$ as $n \to \infty$.

Proof. By Theorem 5 we already have that $n^{-1}\Pi_{t_n,n} \Rightarrow \pi_t$ as $n \to \infty$.
Applying the rescaling to the results of Lemma 6 together with the result of
Lemma 8 now implies that the support set and intensity measure of the
Poisson point-process of each $\mathcal{L}_i$ after rescaling converge to those of the
corresponding Poisson point-process given by Lemma 7. Then the convergence of the
support set and intensity measure for the Poisson random measure $n^{-1}\Xi^{p_n}_{t_n,n}$ to
those of $\xi^{p}_{t}$ implies the weak convergence of these processes. □